ABSTRACT


As unmanned aerial vehicle (UAV) technology and
availability improve, it becomes increasingly
important to operate UAVs efficiently. Utilizing one UAV at
a  time  is  a  relatively  simple  task,  but  when  multiple  UAVs 
need  to  be  coordinated,  optimal  search  plans  can  be 
difficult  to  create  in  a  timely  manner.  In  this  thesis,  we 
create  a  decision  aid  that  generates  efficient  routes  for 
multiple UAVs using dynamic programming and a limited-look-ahead
heuristic. The goal is to give the user the best
knowledge  of  the  locations  of  an  arbitrary  number  of  targets 
operating  on  a  specified  graph  of  nodes  and  arcs.  The 
decision  aid  incorporates  information  about  detections  and 
nondetections  and  determines  the  probabilities  of  target 
locations  using  Bayesian  updating.  Target  movement  is 
modeled  by  a  Markov  process.  The  decision  aid  has  been 
tested  in  two  multi-hour  field  experiments  involving  actual 
UAVs  and  moving  targets  on  the  ground. 

EXECUTIVE SUMMARY


Since  its  conception,  the  unmanned  aerial  vehicle  (UAV) 
has  been  a  coveted  battlefield  asset.  The  ability  of  these 
vehicles  to  perform  reconnaissance  and  attack  missions  while 
keeping the operators directly out of harm's way creates an
advantage  in  the  domains  of  information  gathering  and  force 
protection.  UAVs  have  only  recently  been  introduced  on  the 
battlefield  in  significant  numbers,  and  the  ability  to 
operate  multiple  UAVs  efficiently  and  effectively  can  be 
improved  further. 

This  thesis  creates  a  decision  aid  that  provides 
efficient  search  routes  for  multiple  UAVs  searching  for 
multiple  targets  operating  on  a  known  graph  of  nodes  and 
arcs.  The  decision  aid  dynamically  provides  estimates  of 
target  locations  during  its  use. 

The  decision  aid  consists  of  a  dynamic  program  that  is 
solved  approximately  using  a  two-timestep  look-ahead 
heuristic.  Target  location  probabilities  are  computed  using 
Bayesian  updating  based  on  the  detections  and  nondetections 
from  the  previous  timestep.  The  decision  aid  includes  the 
possibility  for  UAVs  to  go  on  and  offline  due  to  mechanical 
difficulties  or  limited  endurance. 

The decision aid was tested in two field experiments at
Camp Roberts, California, as part of the USSOCOM-NPS Field
Experimentation Program. The field experiments included up
to three UAVs and five target vehicles. For the second
experiment, a prototype of the decision aid running through
a Microsoft Excel user interface was used. The interface
proved to be highly effective in communicating to the user
the current knowledge of target locations and provided
timely recommendations for the UAV operators.




ACKNOWLEDGMENTS


We would especially like to thank Professors Royset,
Kress,  and  Chung  for  all  of  their  help  in  the  design  and 
experimentation  of  our  model.  We  would  also  like  to  thank 
Anton  Rowe  for  his  work  on  the  Excel  interface.  Completing 
this  thesis  is  an  important  step  in  furthering  our  education 
and  a  milestone  in  our  young  careers  and  we  could  not  have 
done  it  without  all  of  your  help  and  guidance.  We  would 
also  like  to  thank  all  of  our  other  professors  from  NPS  as 
well  as  our  family  and  friends  who  helped  us  through  this 
difficult  process. 




I.  INTRODUCTION


A.  MOTIVATION  AND  PROBLEM  DEFINITION 

Search  for  moving  targets  arises  in  many  different 
contexts.  For  example,  searching  is  necessary  when  the  goal 
is  to  find  drug  smugglers  or  shot-down  pilots  during  search 
and rescue missions. The sensors used for these searches
are often mounted on unmanned aerial vehicles (UAVs), which
thereby become search assets. When multiple UAVs take part
in a search, they must be operated and managed effectively
within the search environment.

We  consider  a  finite  number  of  searchers  and  targets 
that  move  on  a  graph  of  nodes  and  arcs.  We  assume  the 
searchers  have  a  close  estimate  of  the  number  of  targets. 
The  targets  remain  within  the  graph  and  move  according  to  a 
known  Markov  process.  The  overall  goal  is  to  route  the 
searchers  during  a  finite  time  horizon  so  that  the  search 
coordinator  gains  the  maximum  situational  awareness  of  all 
targets,  as  quantified  by  probability  distributions  of 
target  locations.  There  are  many  possible  objective 
functions  for  problems  of  this  kind.  We  specifically  aim  to 
maximize  the  expected  number  of  detected  targets  until  the 
finite  time  horizon  while  ignoring  targets  that  are  known  to 
be  located  at  a  given  location  with  a  probability  larger 
than  a  specified  threshold.  Target  thresholds  are  discussed 
in detail in Section A of Chapter II. We refer to this
problem as the search optimization problem (SOP). In this
thesis, we develop a model for SOP and a heuristic algorithm
for obtaining efficient search plans in real time within a
rolling time horizon framework.

The graph in SOP could represent a road network where
nodes  are  intersections  and  arcs  are  roads.  Alternatively, 
the  graph  could  represent  a  grid  of  area  cells  on  the  open 
ocean.  Figure  1  shows  an  example  of  nodes  and  arcs  in  a 
road  network  at  Camp  Roberts,  California. 


Figure  1.  Example  of  Graph. 


Currently,  no  tractable  model  of  SOP  exists  that 
incorporates  all  major  aspects  of  real-world  operations. 
SOP  is  difficult  to  solve  optimally  because  the  optimal  move 
for  the  searchers  at  a  timestep  is  dependent  on  the  future 
searcher  locations  and  actions  as  well  as  target  location 
probabilities.  We  refer  to  such  locations,  actions,  and 
probabilities at a particular point in time as the "state"
at  that  time.  This  dependence  on  future  states  requires  the 
use  of  dynamic  programming.  This  situation  tends  to  result 
in  intractable  model  formulations  of  SOP  that  cannot  be 
solved  quickly  enough  for  use  in  a  real-time  decision  aid. 
Dynamic programming is discussed in subsection B.2.

In this thesis, we develop a new version of a decision
aid called the Aerial Search Optimization Model (ASOM); see,
e.g., [12]. It consists of a tractable model for SOP, an
associated heuristic algorithm for generating search
policies, and a user interface. ASOM is specifically
tailored for use by UAV operators, provides effective UAV
routes quickly, and is relevant to many different search
applications.

B.  FUNDAMENTAL CONCEPTS

1.  Bayesian Updating

Bayesian  updating  in  the  context  of  search  is  a  process 
that  begins  with  prior  knowledge  of  target  location 
probabilities,  commonly  referred  to  as  the  a  priori  map. 
This map is based on previous information, if such information
exists, or it is assumed to be uniform, absent prior
information. Figure 2 gives an example of a four-cell a priori
map where a single searcher is searching for a single
stationary target known to be present in the map. In this
thesis, we account for false negatives, but we assume that a
searcher will not report a target on a node or an arc if
there is no target at that node or arc (i.e., no false
positives). We refer to Chung and Burdick [3] for a
discussion of false positive reports. If the searcher looks
in the top left cell and fails to find the target, then
Figure 3 shows the resulting posterior map given the
searcher has a 0.5 conditional probability of detection. The
posterior  map  is  computed  by  the  following  equation: 


$$P(A_i \mid D') = \frac{P(D' \mid A_i)\,P(A_i)}{\sum_j P(D' \mid A_j)\,P(A_j)}$$

where

$i, j$              index of target cells
$P(A_i)$            probability target is located in cell $i$
$P(D' \mid A_i)$    probability of no detection in cell $i$ given
                    target is in cell $i$
$P(A_i \mid D')$    probability target is located in cell $i$
                    given no detection is made in that cell

For  each  cell,  the  updated  probabilities  are  computed 
by  multiplying  the  probability  of  no  detection  given  there 
is  a  target  in  the  cell  by  the  prior  probability  there  is  a 
target  in  the  cell.  This  number  must  then  be  divided  by  the 
sum  of  these  numbers  for  all  cells  in  order  to  normalize  the 
probabilities.  See  Wagner,  Mylander,  and  Sanders  [20]  for  a 
more  detailed  mathematical  explanation  of  Bayesian  Updating. 
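
The update described above can be sketched in a few lines of Python. This is only an illustration: the function name, the uniform prior, and the cell numbering are assumptions made for the example and are not taken from Figures 2 and 3. Cells that are not searched are given a no-detection probability of 1.

```python
def nondetection_update(prior, p_no_detect):
    """Posterior over cells after a look that produced no detection.

    prior[i]       -- P(A_i), prior probability the target is in cell i
    p_no_detect[i] -- P(D'|A_i), probability of no detection if the target is in cell i
    """
    # Numerator of Bayes' rule for every cell.
    unnormalized = [q * p for q, p in zip(p_no_detect, prior)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("a nondetection has zero probability under this prior")
    # Divide by the sum over all cells to normalize.
    return [w / total for w in unnormalized]


# Hypothetical four-cell map with a uniform prior (an assumption for this example).
prior = [0.25, 0.25, 0.25, 0.25]
# The searcher looks only in cell 0 with a 0.5 conditional probability of detection;
# the other cells are not searched, so P(D'|A_i) = 1 for them.
p_no_detect = [0.5, 1.0, 1.0, 1.0]
posterior = nondetection_update(prior, p_no_detect)
print(posterior)
```

Under these assumed inputs, the searched cell drops to 1/7 and each unsearched cell rises to 2/7, so the probability mass removed from the searched cell is spread over the remaining cells.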


Figure  2.  A  Priori  Target  Distribution. 

Figure 3.  Posterior Target Distribution.

The above discussion deals with "false negatives,"
which  occur  when  a  searcher  fails  to  detect  a  target  that  is 
actually  there. 

2.  Dynamic Programming

Dynamic programming is a framework for modeling
decisions made over time [14]. The state of a dynamic
program is a snapshot of the system being modeled at a
specific time. Given a finite time horizon and a finite
number of states, the backward recursion algorithm generates
optimal decisions at every timestep, starting from the end
and working backwards. However, this involves examining all
states at each timestep and determining the best decision
at each state.

The  backward  recursion  algorithm  breaks  down  if  there 
are  an  infinite  number  of  states  and/or  the  determination  of 
the  best  decision  at  a  state  is  a  difficult  optimization 
problem.  In  addition,  it  may  be  problematic  to  use  this 
algorithm  if  the  time  horizon  is  not  known. 

Approximate  dynamic  programming  algorithms  seek  to 
overcome  the  shortcomings  of  the  backward  recursion 
algorithm  by  introducing  approximations.  There  exist  a 
large  number  of  approximate  dynamic  programming  algorithms, 
see, e.g., [14]. Typically, these algorithms step forward
in time. The main difficulty is to determine the "value" of
transitioning to a specific state. One technique is to use
a limited look-ahead. This is a process of enumerating all
possible moves for all timesteps of the designated look-ahead
period and making the moves that achieve the greatest
reward in terms of the objective function. Longer look-ahead
periods will better approximate the optimal dynamic
programming  solution.  We  will  use  an  approximate  dynamic 
programming  algorithm  because  it  provides  an  effective 
solution  that  can  be  provided  in  real-time,  a  key 
requirement  for  our  implementation. 
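
A limited look-ahead of this kind can be sketched as a short recursive enumeration. The graph, the reward values, and the function name below are hypothetical and chosen only for illustration. The sketch also assumes that the reward of a node is fixed, whereas in a search problem the reward would change as probabilities are updated.

```python
def look_ahead(node, neighbors, reward, horizon):
    """Enumerate all move sequences of length `horizon` starting at `node`.

    Returns (best total reward, best sequence of nodes visited).
    """
    if horizon == 0:
        return 0.0, []
    best_value, best_moves = float("-inf"), []
    for nxt in neighbors[node]:
        # Value of moving to nxt now plus the best value achievable afterwards.
        future_value, future_moves = look_ahead(nxt, neighbors, reward, horizon - 1)
        value = reward[nxt] + future_value
        if value > best_value:
            best_value, best_moves = value, [nxt] + future_moves
    return best_value, best_moves


# Hypothetical four-node graph and fixed rewards (assumptions for this example).
neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
reward = {"a": 0.0, "b": 0.1, "c": 0.3, "d": 0.9}

one_step_value, one_step_moves = look_ahead("a", neighbors, reward, 1)
two_step_value, two_step_moves = look_ahead("a", neighbors, reward, 2)
print(one_step_moves, two_step_moves)
```

With these assumed values, the one-step look-ahead moves to c, while the two-step look-ahead moves to b in order to reach d afterwards. The enumeration grows exponentially with the horizon, which is why the look-ahead period is kept short.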

C.  PAST WORKS

The  goal  of  the  constrained-path,  moving-target  search 
problem  [5,  6,  7,  13,  18,  19,  21]  is  to  find  the  search 
route  that  maximizes  the  probability  of  target  detection 
within  a  fixed  time.  The  classic  setup  involves  a  single 
searcher  and  a  single  target  moving  within  a  finite  number 
of  cells  in  discrete  time.  Both  the  searcher  and  the  target 
are  allowed  to  occupy  a  single  cell  each  timestep,  and 
detections  may  only  occur  when  the  searcher  and  target 
occupy  the  same  cell.  Detection  probabilities  can  be  based 
on  sensor  data  or  derived  from  the  random  search  formula 
[22].  The  target's  probability  distribution  is  maintained 
through  Bayesian  updates  for  nondetection  each  timestep  if 
the  target  is  not  found. 

For the classic constrained-path, moving-target search
problem, Eagle and Yee [6] select a searcher route over a
given number of time periods that minimizes the probability
of nondetection. Their formulation is a non-linear program
with linear constraints, which allows one to apply
Zangwill's [24] Convex Simplex Method (CSM). Eagle and Yee
[6]  create  a  myopic  search,  and  while  results  of  their 
example  show  the  CSM  solution  to  always  be  optimal,  the 
myopic  search  may  not  provide  a  good  approximation  of  the 
optimal  solution. 

A partially observable Markov decision process [2] is
another concept that has been applied to the constrained-path,
moving-target search problem. The idea is that a
decision  must  be  made  based  on  partial  information,  and  the 
outcome  of  the  decision  is  unknown  until  after  it  has  been 
made.  The  search  application  is  well-suited  for  this  setup 
because  the  searcher  will  have  incomplete  knowledge  of 
target  location  after  each  timestep  based  on  the  updated 
target  probability  distribution.  The  searcher  will  not  know 
whether  or  not  the  search  will  be  successful  until  after  the 
new  search  route  is  chosen. 

Eagle  [5]  provides  an  optimal  solution  technique  using 
dynamic  programming  and  assuming  a  finite  time  horizon.  He 
uses  a  partially  observable  Markov  decision  process,  which 
is  faster  than  standard  linear  programming  methods  because 
total  enumeration  is  limited  to  searching  only  the  cells  one 
can  reach  from  the  searcher's  previous  location.  Stewart 
[18,  19]  creates  an  approximate  solution  procedure  using 
branch-and-bound  techniques.  Eagle  and  Yee  [7]  extend 
Stewart's  work  and  create  a  branch-and-bound  method  that 
produces  optimal  solutions  and  is  faster  than  the  dynamic 
programming approach. Washburn [21] creates a branch-and-bound
approach as well. Both Eagle and Yee [7] and
Washburn [21] consider searchers that have continuous search
routes. Other than Washburn [21], who accounted for
multiple searchers, these problems consider one searcher
against a single target and provide optimal solutions.

Dell,  Eagle,  Martins,  and  Santos  [4]  extend  the  problem 
to include multiple searchers. They create a branch-and-bound
procedure to optimally solve the problem as well as
six heuristics that take four different approaches to the
problem: solve partial problems optimally, maximize the
expected number of detections, implement a genetic
algorithm, and use local searches with random restarts. The
partial problem technique involves a moving horizon where
each partial problem is solved optimally using branch-and-bound.

Members  of  the  autonomous  systems  and  control  community 
have  analyzed  the  multiple  UAV  search  problem  as  well.  Some 
utilize  recursive  Bayesian  filtering  [1,  10]  while  others 
focus  on  cooperative  control  [8,  11]  and  decentralized 
search  [1]  techniques.  Many  of  them  have  considered  the 
problem  of  multiple  searchers  looking  for  multiple  targets 
[1,  8,  9,  10,  11,  23],  which  is  an  extension  to  the  works 
mentioned above [5, 6, 7, 13, 18, 19, 21]. Fernandez,
Flint, and Polycarpou [9] as well as Chung and Burdick [3]
create Bayesian methods that help take into account false
positives.

Another  consideration  is  using  discrete  time  to  more 
closely  model  continuous  time.  This  situation  occurs  when 
the  travel  time  for  targets  and  searchers  between  cells  is 
not  a  multiple  of  the  discrete  timestep.  Lau,  Huang,  and 
Dissanayake  [13]  enhance  the  branch-and-bound  method  to  take 
into  consideration  the  non-uniformity  of  the  search 
environments.  They  develop  a  new  bound  that  leads  to  faster 
solution  times  as  well  as  the  possibility  of  better 
solutions when the environment being modeled is spatially
and temporally non-uniform in nature. Sato and Royset [17]
produce alternative bounds and even faster solutions.

In  the  near  future,  sufficient  technology  will  exist  to 
allow  the  automatic  detection  of  targets  by  computer 
systems.  When  these  automatic  detections  can  be 
incorporated  within  a  search  program,  it  will  allow  the 
autonomous routing of UAVs. With current technology, human
operators  are  required  to  visually  identify  targets.  The 
issue  of  target  detection  can  be  handled  with  a  decision  aid 
that  has  an  input  for  the  detections  made  each  timestep. 

While  many  solutions  have  been  presented  for  the 
constrained-path  moving-target  search  problem  and  some 
research  tools  have  been  developed  for  specific  scenarios 
(see,  e.g.,  [15,  16]),  a  decision  aid  that  can  be  used  in 
real-world  scenarios  has  yet  to  be  fully  developed.  The 
goal  of  our  research  is  to  provide  a  user-friendly  decision 
aid  that  is  capable  of  creating  efficient  UAV  routes  for 
detecting  multiple  targets  operating  on  a  known  graph.  This 
decision  aid  will  be  capable  of  providing  real-time 
effective  decisions  with  computation  times  on  the  order  of 
seconds.

D.  STRUCTURE  OF  THESIS  AND  CHAPTER  OUTLINE 

This  thesis  is  divided  into  five  chapters,  including 
the  Introduction.  Chapter  II  discusses  the  development  of 
the  model  and  the  dynamic  programming  formulation.  Chapter 
III  introduces  the  actual  algorithm  used  to  implement  our 
model.  Next,  it  analyzes  the  accuracy  and  runtime  of  our 
heuristic approach. Finally, it discusses the Excel user
interface created for our decision aid. Chapter IV describes
our field experiments at Camp Roberts, California, and
explains some of the updates our decision aid underwent in
the process. Chapter V gives several conclusions from our
work as well as recommendations for future work involving
ASOM.

II.  MODEL


A.  MODEL  DEVELOPMENT 

We  formulate  a  model  of  SOP  using  dynamic  programming 
with  Bayesian  updating.  We  assume  that  each  target  moves 
according  to  a  Markov  process  and  that  the  targets  move 
independently  of  one  another.  The  presentation  below  and 
our  implementation  of  the  model  assume  that  all  the  Markov 
processes  for  the  various  targets  have  the  same  transition 
matrices.  However,  it  is  trivial  to  extend  this  to  the 
general  case  where  targets  follow  different  movement 
processes.  Targets  are  differentiated  by  their  velocity  and 
type  characteristics  (e.g.,  person  versus  vehicle). 

The  searchers  are  differentiated  by  a  variety  of 
characteristics  including  name,  velocity,  sweepwidth  of 
their  sensors,  and  whether  or  not  they  have  a  camera  with  a 
moving  eye  which  enables  them  to  search  nearby  roads  while 
flying  straight  routes  between  nodes. 

All  dynamic  programming  models  must  have  discrete 
timesteps.  In  our  model,  timesteps  are  used  as  a  discrete 
representation of continuous time. One timestep is the
length of time between consecutive discretized values of time;
smaller timesteps give a better approximation of continuous
time.

Our  dynamic  programming  model  contains  several  states 
that  change  according  to  some  process  as  the  model  advances 
through  time  by  the  use  of  timesteps.  The  state  of  the 
searcher  includes  the  arc  the  searcher  is  currently  on,  the 
amount  of  time  until  the  searcher  reaches  the  head  node  of 
that  arc,  and  the  type  of  move  that  is  currently  being 
executed. There are three possible types of moves: "Road
Search," "Transit," and "Search at Location." "Road Search"
means that the searcher examines the road corresponding to
the current arc while traversing it. It is possible to
detect targets on that road, and any time remaining in the
timestep after reaching the head node of the arc is spent
searching that head node. "Transit" means that the searcher
flies a direct route from the tail node to the head node.
It is not possible to detect a target during this
type of move, but it offers the possibility of reaching the
head node faster, which allows more time for search at that
node. "Search at Location" means that the searcher spends
the entire timestep searching its current location.

The other main states in the dynamic programming model
are the target probability maps. There is one probability
map for each target and the entire map is a matrix where the
entry in row i and column j represents the probability that
the target is on arc (i,j); if i = j, this represents the
probability at a node. These probability maps are
dynamically updated as the model transitions from one
timestep to another. The updates due to detections and
nondetections using Bayesian updating are first carried out.
Then, the updates due to movement of targets by the Markov
process are computed.
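
The order of the two updates can be illustrated with a simplified sketch. It assumes that targets sit only on nodes, so the probability map is a vector rather than a matrix over arcs, and it ignores travel times along arcs. The function names, the detection probabilities, and the transition matrix are hypothetical.

```python
def bayes_no_detection(prob, p_detect):
    """Bayesian update after no detection; p_detect[i] is P(detect | target at node i)."""
    weights = [(1.0 - d) * p for d, p in zip(p_detect, prob)]
    total = sum(weights)
    return [w / total for w in weights]


def markov_update(prob, transition):
    """Move the probability mass one timestep: new[j] = sum_i prob[i] * transition[i][j]."""
    n = len(prob)
    return [sum(prob[i] * transition[i][j] for i in range(n)) for j in range(n)]


# Hypothetical three-node example (all numbers are assumptions).
prob = [0.5, 0.3, 0.2]
p_detect = [0.8, 0.0, 0.0]      # only node 0 is searched this timestep
transition = [
    [0.5, 0.5, 0.0],            # each row sums to 1
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]

after_search = bayes_no_detection(prob, p_detect)     # first the Bayesian update
next_map = markov_update(after_search, transition)    # then the target movement
print(after_search, next_map)
```

Applying the nondetection update first and the movement update second mirrors the order stated above; reversing the order would generally give a different map.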

More  specifically,  when  detections  are  made,  the 
location and type of detection are entered into the model.
The model updates the target probability maps for the
detections based on the probabilities of seeing different
targets at the input detection locations. It looks at all
the different "detection scenarios," determines the
probability of each, and decides which scenario
occurred based on a random draw with the associated
probabilities. Here, a "detection scenario" is an element
of the set of all the different permutations of possible
target detections at each detection location. For example,
if there are two detections at time t and three available
targets, the model creates all possible permutations of
target detection scenarios. In this situation, there are
six possible scenarios: three choices (possible targets) for
the first detection and then two remaining choices (one of
the two not found in the first detection) for the second
detection. The model then computes the probability of each
of the six different scenarios occurring based on the target
marginal probabilities and decides which one actually
occurred using a random draw with the corresponding
probabilities.
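
The enumeration of detection scenarios can be sketched as follows. The weighting used here, a product of each assigned target's marginal probability at the detection location followed by normalization, is an assumption made for illustration; the text only states that scenario probabilities are based on the target marginals. The target names, node names, and numbers are hypothetical.

```python
from itertools import permutations
import random


def detection_scenarios(marginals, detection_locations):
    """List every assignment of distinct targets to detections with its probability.

    marginals[u][loc] is the marginal probability that target u is at loc.
    assignment[k] is the target assumed to be seen at detection_locations[k].
    """
    targets = sorted(marginals)
    weighted = []
    for assignment in permutations(targets, len(detection_locations)):
        weight = 1.0
        for target, location in zip(assignment, detection_locations):
            weight *= marginals[target].get(location, 0.0)
        weighted.append((assignment, weight))
    total = sum(weight for _, weight in weighted)
    if total == 0.0:
        raise ValueError("no target can explain the reported detections")
    return [(assignment, weight / total) for assignment, weight in weighted]


# Three hypothetical targets and two detections, as in the example in the text.
marginals = {
    "u1": {"n1": 0.6, "n2": 0.1, "n3": 0.3},
    "u2": {"n1": 0.2, "n2": 0.7, "n3": 0.1},
    "u3": {"n1": 0.3, "n2": 0.3, "n3": 0.4},
}
scenarios = detection_scenarios(marginals, ["n1", "n2"])
assignments = [a for a, _ in scenarios]
probabilities = [p for _, p in scenarios]

# Random draw of the scenario that is taken to have occurred.
rng = random.Random(0)
chosen = rng.choices(assignments, weights=probabilities, k=1)[0]
print(len(scenarios), chosen)
```

With three targets and two detections the sketch produces the six scenarios mentioned above. The seeded generator is used only to make the illustration repeatable.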

We  also  use  the  concept  of  search  thresholds.  This 
threshold  is  a  user  input  between  0  and  1  used  to  determine 
what  level  of  target  knowledge  will  constitute  "knowing" 
where  a  target  is  located.  This  is  an  attempt  to  gain 
better  total  situational  awareness  by  ignoring  targets  that 
we  "know"  are  at  certain  locations.  A  threshold  value  of  1 
creates  a  greedy  policy  where  searchers  will  circle  targets 
unless  a  higher  probability  mass  presents  itself  at  a  nearby 
location. In contrast, if the threshold value is less
than  1,  then  targets  whose  maximum  probability  mass  is  above 
that  threshold  will  not  be  searched  for,  resulting  in  a  less 
greedy  policy. 

We  also  calculate  an  aggregate  probability  map  to 
represent  the  normalized  probability  of  all  targets  that  are 
unknown  (i.e.,  do  not  reach  the  threshold)  by  summing  the 
probability mass of all unknown targets at each location and
dividing  it  by  the  number  of  these  targets. 
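
A minimal sketch of the aggregate map follows. The representation of each probability map as a dictionary from location to probability, and all names and numbers, are assumptions made for the example.

```python
def aggregate_map(marginals, threshold):
    """Average the maps of all targets whose largest probability is below the threshold.

    marginals is a list with one dict per target, mapping location -> probability.
    """
    unknown = [m for m in marginals if max(m.values()) < threshold]
    if not unknown:
        return {}  # every target is considered "known"
    locations = set().union(*unknown)
    return {
        loc: sum(m.get(loc, 0.0) for m in unknown) / len(unknown)
        for loc in locations
    }


# Hypothetical example: the first target is "known" at n1 and is therefore ignored.
marginals = [
    {"n1": 0.95, "n2": 0.05},
    {"n1": 0.20, "n2": 0.80},
    {"n1": 0.40, "n2": 0.60},
]
agg = aggregate_map(marginals, threshold=0.9)
print(agg)
```

In this example only the second and third targets contribute, giving roughly 0.3 at n1 and 0.7 at n2, so the aggregate map still sums to one.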

SOP  is  defined  in  terms  of  some  finite  time  horizon. 
This  may  be  related  to  the  endurance  of  the  searchers  (e.g., 
UAV  flight  time)  or  operational  considerations.  In  practice, 
the  time  horizon  may  not  be  completely  known.  Looking 
further  into  the  future  with  a  dynamic  program  will  give 
better  decisions  in  the  current  timestep  than  a  shorter 
look-ahead. To limit computing time and allow for a real-time
decision aid, we only consider a two-timestep look-ahead,
i.e., we set the time horizon in SOP to two. We call
this the two-timestep look-ahead problem (TTLP). The
objective  function  in  TTLP,  which  we  maximize,  is  the 
expected  number  of  target  detections  at  all  arcs  and  nodes 
visited  during  a  given  sequence  of  two  moves  for  all 
searchers.  In  determining  the  aggregate  probability  mass 
for  the  second  time  period,  the  objective  function  assumes 
that  there  are  no  detections  during  the  first  timestep.  The 
TTLP  can  be  solved  optimally  by  total  enumeration,  but  as 
the  number  of  searchers  increases,  the  computational  effort 
increases  exponentially.  As  a  result,  we  constructed  a 
heuristic  algorithm  for  solving  TTLP.  The  heuristic 
algorithm  amounts  to  total  enumeration  of  all  solutions  of  a 
simplified  two  timestep  look-ahead  problem  (STTLP)  which  we 
describe  next.  The  mathematical  formulation  of  STTLP  follows 
in  Section  B. 

STTLP  is  identical  to  the  TTLP  except  that  it  involves 
a  simplified  objective  function.  The  STTLP  objective 
function,  as  in  TTLP,  is  the  expected  number  of  detections, 
but  now  the  expected  number  of  detections  is  computed 
slightly  differently  in  the  second  timestep.  The 
probability mass present in the second timestep is
calculated for each searcher independently (with no
conflicts between moves), only taking into account probability
updates for that particular searcher's previous move (not
all previous moves as in the TTLP). As with the TTLP, it is
assumed that there are no detections during the first
timestep. All states and arrays that are relevant to this
update are labeled with the superscript "ND" (no detection).

The  following  is  an  example  of  the  flow  of  ASOM.  After 
the  initial  states  are  established,  the  searchers  are  given 
starting  locations.  If  there  are  no  initial  detections, 
ASOM  recommends  searcher  moves  based  on  the  STTLP.  For  each 
timestep,  detections  are  entered  and  ASOM  reoptimizes  the 
recommended  searcher  moves  for  the  next  timestep  given  there 
are  no  more  detections.  At  this  point,  the  operator  can 
either  accept  the  recommendations  or  enter  in  alternate 
searcher  moves.  This  process  is  repeated  for  each  timestep 
until  the  search  is  completed. 


B.  DYNAMIC  PROGRAMMING  FORMULATION  OF  STTLP 


For notational convenience, we use $\bullet$ to denote the use
of an array of all the available values for that index; thus,
for some values $x_{i,j}$, $x_{\bullet,j} = \{x_{1,j}, x_{2,j}, \ldots\}$.


Indices

$i, j, k$    Nodes
$m$          Searcher
$t$          Timestep
$u$          Target
$b$          Types of targets

Sets

$M$    Set of all available searchers, $m \in M$.
$I$    Set of all available nodes, $i, j, k \in I$.
$T$    Set of timesteps, $t \in T$.
$U$    Set of targets, $u \in U$.
$B$    Set of target types, $b \in B$.
$R \subset I \times I$    Subset of pairs of nodes $(i,j)$ representing
       arcs for which there is a road connecting $i$
       to $j$, $(i,j) \in R$. Also, $(i,i) \in R$, $\forall i \in I$.
$Q \subset I \times I$    Subset of pairs of nodes $(i,j)$ representing
       possible transit arcs between $i$ and $j$, $i, j \in I$.

Data

$DISTANCE_{i,j}$    Distance along road corresponding to arc
                    $(i,j)$ (mi), $(i,j) \in R$.
$TRANSIT_{i,j}$     Straight-line distance between nodes $i$ and $j$
                    (mi), $(i,j) \in Q$.
$SEARCHARC_m$       1 if searcher $m$ searches a road while on
                    transit arcs, 0 otherwise, $m \in M$.
$SPEED_m$           Constant speed of searcher $m$ (mph), $m \in M$.
$SW_m$              Sweep width of searcher $m$ (mi), $m \in M$.
$SPEEDT_u$          Speed of target $u$, $u \in U$.
$STEP$              Duration of timestep (minutes).
$START_m$           Starting node of searcher $m$, $m \in M$.
$PD_{i,j,m}$        Probability of detecting a target on the road
                    corresponding to $(i,j)$ for searcher $m$ given
                    that a target is on the road, $(i,j) \in R$,
                    $m \in M$. If $i = j$, then $PD_{i,j,m} = 0$ since
                    detection at a node is determined by the
                    function $PDET_{i,m}(\tau)$, defined later.
$MATRIX_{i,j}$      Probability of a target moving onto arc from
                    node $i$ to node $j$, $i, j \in I$.
$TTS_{i,j,u}$       Target timesteps calculation, the number of
                    timesteps target $u$ takes to travel arc
                    $(i,j)$, $(= 60\,DISTANCE_{i,j} / (STEP \cdot SPEEDT_u))$,
                    $(i,j) \in R$, $u \in U$.
$THRESHOLD$         An input threshold between 0 and 1 to
                    determine what level of target knowledge will
                    constitute "knowing" where a target is.
$TURN$              Constant probability that a target travelling
                    along an arc $(i,j)$, $(i,j) \in R$, will turn around
                    and go the other way.
$TYPE_u$            The type of target $u$, $u \in U$, $TYPE_u \in B$.
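
As a small worked example of the target timesteps calculation, the sketch below evaluates the TTS expression for one arc. It assumes that the target speed is given in miles per hour, which the data list does not state explicitly; the function name and the numbers are hypothetical.

```python
def target_timesteps(distance_mi, step_min, speed_mph):
    """TTS = 60 * DISTANCE / (STEP * SPEEDT).

    60 * distance / speed is the travel time in minutes; dividing by the
    timestep duration (minutes) gives the number of timesteps.
    """
    return 60.0 * distance_mi / (step_min * speed_mph)


# A 2-mile road, a 5-minute timestep, and a target moving at 30 mph (assumed values).
tts = target_timesteps(distance_mi=2.0, step_min=5.0, speed_mph=30.0)
print(tts)
```

Here the target needs 4 minutes to traverse the arc, which is 0.8 of one 5-minute timestep.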


The following decision variables are computed at every time
$t \in T$.


Decision Variables at Timestep t

$x_{i,j,m,t}$    1 if searcher $m$ is traveling from $i$ to $j$, 0
                 otherwise.
$y_{m,t}$        Time until searcher $m$ completes the
                 recommended move (hrs).
$z_{m,t}$        1 if searcher $m$ is searching, 0 otherwise.

$V_{m,t} = (x_{\bullet,\bullet,m,t},\, y_{m,t},\, z_{m,t})$
                 Variable array for searcher $m$.

$X_t = (x_{\bullet,\bullet,\bullet,t},\, y_{\bullet,t},\, z_{\bullet,t})^T \; (= V_{\bullet,t})$
                 Variable array for all searchers.


States at time t

$SEARCHER_{m,t} = (i,\, z_{m,t-1},\, y_{m,t-1})^T \;\; \forall m$, where

$i$              Current Location/Destination;
$z_{m,t-1}$      1 if searching, 0 if transiting from previous
                 timestep (assume 1 if $t = 1$);
$y_{m,t-1}$      Time to completion of the move from the
                 previous timestep for searcher $m$ (hrs)
                 (assume 0 if $t = 1$).

$MARG_{i,j,u,t}$    Probability of target $u$ being on arc $(i,j)$,
                    $(i,j) \in R$, $u \in U$, $t \in T$.

$$MARG^{agg}_{i,j,t} = \frac{\sum_{u \in U \,\mid\, \max(MARG_{\bullet,\bullet,u,t}) < THRESHOLD} MARG_{i,j,u,t}}{\sum_{u \in U \,\mid\, \max(MARG_{\bullet,\bullet,u,t}) < THRESHOLD} 1}$$

                    Aggregate probability of all targets being on
                    arc $(i,j)$, $(i,j) \in R$, $t \in T$.

$S_t = (SEARCHER_{\bullet,t},\, MARG_{\bullet,\bullet,\bullet,t})^T$
                    State vector.
Functions

$R_t(S_t, X_t)$

Reward for all searchers traveling between
node i and j, $m \in M$, $i, j \in I$.

$R^{ND}_{m,t}(S^{ND}_{m,t}, V_{m,t})$

Reward for searcher m traveling between node
i and j, $m \in M$, $i, j \in I$. This function is
used only in calculating the future reward,
when there is knowledge of searcher m only.


$PDET_{i,m}(\tau)$

Probability of detection at node i by
searcher m, dependent on the amount of time
searched, $\tau$, $i \in I$, $m \in M$.

$NEGATIVE_{i,j,u,t}(S_t, X_t)$

Function to update probability maps for
failed detection via Bayesian updating.

$NEGATIVE^{ND}_{i,j,u,m,t}(S_t, V_{m,t})$

Function to update probability maps for
failed detection via Bayesian updating for
look-ahead. The heuristic approach takes into
account only the move of searcher m.


$MARKOV_{i,j,u,t}(S_t)$

Function to update probability maps for
target movement based on the Markov matrix.

$MARKOV^{ND}_{i,j,u,m,t}(S^{ND}_{m,t})$

Function to update probability maps for target
movement when only the move of searcher m is
taken into account. It is used in the
calculation of the "no detection" marginals
according to searcher m.

$POSITIVE_{i,j,u,t}(S_t, D_{\cdot,\cdot,\cdot,t})$

Function to update probability maps for
positive detection via Bayesian updating.
Policy: Set $X_t = X_t^*$, where $(X_t^*, X_{t+1}^*)$ is the optimal
solution of the simplified two timestep look-ahead problem
(STTLP):


max  + 


Subject  to: 


X  Xi,j,m,t+  X  XiJ,m\,+ 1  ^  1  V7 


(Do  not  allow  overlapping  of  moves) 


$\sum_m x_{i,j,m,t} \le 1 \;\; \forall i,j$


(Max one searcher per arc at time t)


$\sum_m x_{i,j,m,t+1} \le 1 \;\; \forall i,j$


(Max one searcher per arc at time t + 1)


X  Xi,j,m,t  —  1 


(One  move  per  searcher  at  time  t) 


XX/,./>M+l  -  1  VW? 


(One  move  per  searcher  at  time  t  +  1) 


If  searcher  m  is  at  node  i  at  time  t,  then 


XXv>,r=1  V/w 


(Must  start  at  the  starting  position) 


End  if 


X  XiJ,m,t  *  Zm,t 


(Tracks  transiting/searching  at  time  t) 


^i,j,m,t+ 1  Zm 


(Tracks  transiting/searching  at  time  t  +  1) 
Else  If  t> 2,  then: 

' DISTANCE,  jZmJ  +  V 
'^[TRANSIT, 

SPEED, STEP  /  60 

(Keeps  track  of  timesteps  until  searcher  m  is 
available) 

End  If 

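$x_{i,j,m,t} \in \{0,1\} \;\; \forall i,j,m$

$x_{i,j,m,t+1} \in \{0,1\} \;\; \forall i,j,m$

$z_{m,t} \in \{0,1\} \;\; \forall m$

$z_{m,t+1} \in \{0,1\} \;\; \forall m$

$y_{m,t} \ge 0 \;\; \forall m$

$y_{m,t+1} \ge 0 \;\; \forall m$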
Dynamics (Given $S_t$ and $X_t$)

$\forall j, m$, if $\sum_i x_{i,j,m,t} > 0$, then:

$SEARCHER_{m,t+1} = (j,\, z_{m,t},\, y_{m,t})^T$

Sets the searcher's state to the decisions of
that searcher for this timestep.

End If


$MARG_{i,j,u,t+1} = MARKOV_{i,j,u,t}(NEGATIVE_{i,j,u,t}(POSITIVE_{i,j,u,t}(S_t, D_{\cdot,\cdot,\cdot,t}), X_t)) \;\; \forall i,j,u$

Updates the target marginals for the positive
detection updates, the negative detection
updates, and the movement of the targets
based on the Markov process.


$MARG^{ND}_{i,j,u,m,t+1} = MARKOV^{ND}_{i,j,u,m,t}(NEGATIVE^{ND}_{i,j,u,m,t}(POSITIVE_{i,j,u,t}(S_t, ZEROS_{\cdot,\cdot,\cdot,t}), V_{m,t})) \;\; \forall i,j,u,m$


ZEROS denotes a matrix of zeros as input for
the detection matrix, or "no detections
found" in human input terms. The update has
knowledge of only one searcher at a time, thus
it calculates a separate set of marginals for
each searcher m.

$S_{t+1} = (SEARCHER_{\cdot,t+1},\, MARG_{\cdot,\cdot,\cdot,t+1})^T$

$S^{ND}_{m,t+1} = (SEARCHER_{m,t+1},\, MARG^{ND}_{\cdot,\cdot,\cdot,m,t+1})^T$

Sets the regular and no-detections state
variable arrays.
III. IMPLEMENTATION


A.  MODEL  IMPLEMENTATION

We implement the model in MATLAB version 7.0.1 and
carry out all computational tests on a NES computer with a
1.83 gigahertz AMD Athlon XP processor and 512 megabytes of
RAM. As described earlier, we implement a heuristic
solution to the TTLP, called STTLP. The code is divided into
many sub-functions so that a single aspect of ASOM can be
changed without reworking the entire code. Descriptions of
our MATLAB functions are given in Appendix II.

B.  HEURISTIC  ACCURACY 

The only straightforward method for ensuring that
optimal searcher moves are chosen is total enumeration. The
difficulty with total enumeration is that the total number
of searcher move combinations grows exponentially with the
number of searchers in the TTLP. Thus, we need the
heuristic algorithm, STTLP (see Section B in Chapter II).
We compare our heuristic with the total enumeration approach
in terms of runtime and accuracy to ensure that it provides
effective recommendations and that its speed improvements
are worth sacrificing optimality. For one, two, and three
searchers we create random target marginals, randomly place
the searchers, and compare the moves recommended by our
heuristic and total enumeration functions. We allow
searchers to be "initially blocked" with a probability of
0.25. Here, "initially blocked" means that the searchers
are constrained in their movements from the previous
timestep (i.e., still in transit). This 0.25 probability
represents the fact that during a normal run of our decision
aid, the searchers make direct transits that require two
timesteps and are blocked from making a new move for one
timestep.

Table 1 shows the accuracy results of the heuristic for
1000 simulation runs. The accuracy is the ratio of the
probability mass collected by the heuristic to that
collected by the total enumeration approach. The table also
displays the fraction of runs in which the heuristic returns
the optimal move. The "Within One Move of Optimal" column
gives the fraction of runs in which the heuristic moves
differed from the total enumeration moves for at most one
searcher. Table 2 displays the runtimes of the heuristic
and total enumeration approaches for one, two, and three
searchers along with their 95% confidence intervals.
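The three accuracy measures can be sketched as follows. This is a minimal illustration, not the thesis code: the function names, the representation of a move as a (from node, to node) pair, and the choice to average per-run mass ratios (rather than divide total masses) are all assumptions.

```python
from typing import List, Sequence, Tuple

Move = Tuple[int, int]  # hypothetical encoding: (from_node, to_node) for one searcher


def compare_run(heuristic_moves: Sequence[Move], optimal_moves: Sequence[Move],
                heuristic_mass: float, optimal_mass: float):
    """Return (mass ratio, exact match?, within one move?) for one simulated run."""
    differing = sum(h != o for h, o in zip(heuristic_moves, optimal_moves))
    return heuristic_mass / optimal_mass, differing == 0, differing <= 1


def summarize(runs: List[tuple]):
    """Average the three per-run measures over all runs."""
    n = len(runs)
    ratios, exact, near = zip(*runs)
    return sum(ratios) / n, sum(exact) / n, sum(near) / n


# Two hypothetical runs with two searchers each.
runs = [
    compare_run([(3, 6), (2, 8)], [(3, 6), (2, 8)], 0.50, 0.50),
    compare_run([(3, 6), (2, 5)], [(3, 6), (2, 8)], 0.25, 0.50),
]
print(summarize(runs))
```

A run counts toward "within one move" whenever at most one searcher's move differs, so every exact match is also counted there.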


Table 1.  Heuristic Accuracy Table.


Number of    Accuracy    Returns Optimal    Within One Move of
Searchers                (TTLP) Move        Optimal (TTLP)

1            1           1                  1
2            0.9914      0.944              0.985
3            0.9813      0.843              0.934


Table 2.  Heuristic Runtime Table.


Number of    STTLP Runtime (sec)    TTLP (Total Enumeration)
Searchers                           Runtime (sec)

1            .02165 +/- .00074      .01462 +/- .00074
2            .04219 +/- .00093      .8560 +/- .046
3            .07381 +/- .0076       64.21 +/- 4.45
4            .1046 +/- .00227       4186 (estimated)
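To make the growth in runtime concrete, the following short sketch computes, from the mean runtimes reported in Table 2, how much each additional searcher multiplies the runtime of each approach. The dictionary and function names are illustrative assumptions; the four-searcher enumeration value is the estimate given in the table, not a measurement.

```python
# Mean runtimes in seconds, copied from Table 2.
enumeration_runtime = {1: 0.01462, 2: 0.8560, 3: 64.21, 4: 4186.0}  # 4 is estimated
sttlp_runtime = {1: 0.02165, 2: 0.04219, 3: 0.07381, 4: 0.1046}


def growth_factors(runtime):
    """Ratio of each runtime to the runtime with one fewer searcher."""
    counts = sorted(runtime)
    return {k: runtime[k] / runtime[k - 1] for k in counts[1:]}


for name, table in (("enumeration", enumeration_runtime),
                    ("STTLP", sttlp_runtime)):
    print(name, {k: round(v, 1) for k, v in growth_factors(table).items()})
```

On these figures, each added searcher multiplies the enumeration runtime by a factor between roughly 58 and 75, while the STTLP runtime less than doubles.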
C.  EXCEL  INTERFACE

The Microsoft Excel interface was developed by Mr.
Anton Rowe. Figure 4 is an example of the output display in
the user interface.


Figure  4.  Screenshot  of  Excel  Interface. 




In Figure 4, the red circles represent all possible
nodes and the red triangles represent all possible roads.
The different sizes of the circles and triangles represent
the aggregate probability of finding targets there. The
solid blue boxes represent the different searchers at their
current locations in this scenario. The blue lines and
outlined boxes represent the recommended searcher moves for
the current timestep. If a triangle is encased in the
outline of a blue box, the recommendation is to search the
road to the corresponding node. A dotted blue line going
straight to a node means transit directly to that node. If
there is an outline of a blue box in the middle of a transit
route, the searcher will not reach the designated node in
one timestep, and the move is therefore directed for the
following timestep as well. If a searcher is stationary
(zero speed), the recommended move will always be to stay at
the same location, shown by the blue outline around its
current position. In the example above, Raven is transiting
from node 3 to node 6, but will take two timesteps to reach
node 6. Buster is searching the road from node 2 to node 8
(one timestep), and Scan Eagle is transiting from node 11 to
node 9 (one timestep).

ASOM requires several inputs, including parameters for
both searchers and targets. For each available searcher,
the user provides the name (as it will be displayed on the
interface), the speed, the sweep width, a binary entry
indicating whether the UAV has a moveable camera capable of
searching roads while flying straight-line distances, and
the starting position. An example input is seen in Figure
5. Notice that the scenario also contains a stationary
searcher, which is entered as a searcher with speed equal to
zero. A starting position must also be provided for a
stationary searcher, but its "Sweep" and "Arc" categories
are not used.
Figure 5.  Example Searcher Input.
The target inputs are simply the expected number and
type of targets that will be present in the scenario. For
each target, a speed and type must be provided, as seen in
Figure 6. If the number of targets is not known, a
reasonable estimate should be provided; the better the
estimate, the more accurate the model will be.


Figure  6.  Example  Target  Input. 


Detections are entered during the current timestep of
the model. The key feature here is the "Recommend" button.
When pushed, this button gives recommendations based on the
current state. If, however, detections are made between
then and the end of the timestep, they can be entered to
update the state, and a new set of moves will be produced.
An example timeline of entering detections and moving
targets can be seen in Figure 7.


Figure  7.  User  steps  in  ASOM. 


Detections are entered with four parameters: (i)
timestep of the detection, (ii and iii) perceived starting
node and ending node of the target, and (iv) detection
type. The starting and ending nodes together represent the
arc (i,j) on which the target was detected: if $i = j$, the
target was detected stationary at node i, and if $i \neq j$,
the target was detected on the road going from node i to
node j. An example of what the target detection sheet might
look like at timestep 5 can be seen in Figure 8. In this
example, the first line says there was a detection of type 1
on the road from node 2 to node 8 at time 1. Similarly, the
second line says there was a detection of type 2 stationary
at node 5 at time 3.
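The four detection parameters map naturally onto a small record type. The sketch below is illustrative only: the class and field names are assumptions and are not taken from the ASOM code or the Excel sheet. It encodes the two example detections described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    timestep: int        # (i) timestep of the detection
    start_node: int      # (ii) perceived starting node i
    end_node: int        # (iii) perceived ending node j
    detection_type: int  # (iv) detection type

    @property
    def stationary(self) -> bool:
        """True when i == j, i.e. the target was seen stopped at a node."""
        return self.start_node == self.end_node

    def describe(self) -> str:
        if self.stationary:
            where = f"stationary at node {self.start_node}"
        else:
            where = (f"on the road from node {self.start_node} "
                     f"to node {self.end_node}")
        return (f"type {self.detection_type} detection {where} "
                f"at time {self.timestep}")


# The two example rows described in the text.
rows = [Detection(1, 2, 8, 1), Detection(3, 5, 5, 2)]
for row in rows:
    print(row.describe())
```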
Figure 8.  Example Target Detections.


Additional data for ASOM include the latitude/longitude
of the nodes, data for the roads (start/finish nodes,
length of the road, and latitude/longitude position to
display the red triangle representing probability), direct
distances between nodes (as a UAV can fly them), and the
Markov movement matrix for each target.
IV.  FIELD  EXERCISES


We performed two field experiments in February and May
2008 at Camp Roberts, California using multiple Raven and
Buster UAVs.

An important part of ASOM is its ability to take into
consideration the needs of the operator and to react to
unexpected situations. Several features of ASOM would not
exist had we not field tested the decision aid and received
feedback from UAV operators. This feedback allows ASOM to
handle realistic scenarios in multiple environments.

A.  FEBRUARY  EXPERIMENT 

The purpose of the February experiment was to test a
preliminary version of ASOM and make sure the results passed
a reality check. A secondary purpose was to see what could
be improved in the underlying code and what changes were
necessary to make ASOM run more smoothly. Several weather
restrictions limited the experiment, but overall the
objective of the experiment was accomplished.

We ran our preliminary model with five moving targets
(cars) traveling at 25 miles per hour and three searchers:
one ground team, one Raven UAV, and one Buster UAV. ASOM
isolated the possible location of the targets to one side of
the map, as seen in Figure 9, and was correct in its
judgment of possible target locations. In this preliminary
version of the model, aggregate probability is shown by a
color scale rather than by size, with green representing the
lowest probability, fading to yellow, and finally to red
representing the highest probability. The nodes are still
represented by circles, but the roads are represented by
straight lines between the nodes.


Figure  9.  February  Experiment  Final  Probability  Map. 


There were several important lessons learned from this
experiment. The first stemmed from the fact that our
approach was greedy in its search patterns. At this point,
the searchers tended to find a target and then track it,
because this resulted in the largest reward, at the expense
of knowledge of the other targets. This is not optimal if
the objective is to maximize total knowledge of the system.
We remedied this by creating the threshold input. As
described earlier, this is equivalent to saying you "know"
where a target is located if its maximum probability mass at
any location is greater than the threshold probability.

Another change concerned making the model more
user-friendly than the MATLAB code and input techniques in
use at the time. This was handled with a new Excel
interface, as discussed in the previous chapter. The
usefulness of the interface is discussed in the May
experiment section.

B .  MAY  EXPERIMENT 

The goal of the May experiment was to test the updated
code, which included the target threshold constraints to
discourage a greedy policy that tracked detected targets.
We implemented the Excel interface for the first time and
evaluated its utility and functionality. The experiment was
run with four targets (again, cars traveling at 25 miles per
hour) and three searchers: one Buster UAV and two Raven
UAVs.

The first day's trials led to the creation of the
disabled node. This node is an abstract location where
searchers are placed when they are refueling, damaged, or
otherwise unusable. This allows ASOM to function in a
larger set of scenarios and to take into account unexpected
events in which a UAV becomes disabled. For example, in the
first trial, the Buster UAV lost contact, deployed its
parachute, and was unable to continue its search. The Raven
UAVs also ran out of fuel sooner than expected and had to
land and refuel, thus cutting the experiment runs short.

The second day's trial utilized the disabled node
update. This trial was extended to a nearly three-hour
scenario in which UAVs were forced to refuel, thus testing
the capabilities of the disabled node.
Figure 10 shows the locations of all of the targets and
searchers as well as the color of the vehicles detected.
The green and tangerine colored boxes represent actual
target detections by the searchers. Yellow boxes represent
possible failed detections, meaning that one of the searcher
and target left a location at nearly the same time as the
other arrived, so a failed detection could have occurred. A
red box means a target and a searcher were at the same
location, but no detection was made at that time. From
this, we calculated an estimate of the probability of
detection with a 95% confidence interval (0.46 +/- 0.20).
Since the data set is relatively small, the confidence
interval on the probability of detection is very wide. In
any case, this might give us a better estimate of the actual
probability of detection for these UAVs. In ASOM, the
probability of detection is derived from the random search
formula and depends on time as well as searcher
characteristics, but it is generally higher than the above
empirical estimate.
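For readers who wish to see how such an interval arises, the sketch below computes a normal-approximation (Wald) 95% interval for a detection proportion. Both the method and the counts are assumptions: the text reports neither how the interval was computed nor the underlying numbers of detections and opportunities. The counts used here (11 detections in 24 opportunities) are hypothetical and were chosen only because they yield an interval of similar size.

```python
import math


def wald_interval(detections: int, opportunities: int, z: float = 1.96):
    """Point estimate and half-width of a normal-approximation interval."""
    if opportunities <= 0 or not 0 <= detections <= opportunities:
        raise ValueError("need 0 <= detections <= opportunities, opportunities > 0")
    p_hat = detections / opportunities
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / opportunities)
    return p_hat, half_width


# Hypothetical counts, not taken from the experiment.
p_hat, half_width = wald_interval(11, 24)
print(f"{p_hat:.2f} +/- {half_width:.2f}")
```

Because the half-width shrinks only with the square root of the number of opportunities, halving the interval width would require about four times as many searcher-target encounters.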
Figure 10.  May Experiment Detection Results.
Failed detections could stem from any combination of
three sources of error: the searchers were at incorrect
locations, the targets were at incorrect locations, or our
estimate of the probability of detection for searchers
finding targets was inaccurate. The problem of searchers
being at wrong locations seems unlikely because they are
given GPS coordinates to fly to, and their locations are
displayed on a screen. It is possible the targets (who were
people driving around in cars) did not know the Camp Roberts
map as well as we had hoped and were actually driving to
wrong locations. The most likely source of error was that
the camera feeds on the UAVs were scrambled enough that the
operators had a hard time identifying targets, thus lowering
our probability of detecting a target given that a searcher
and target were at the same location.

One other interesting aspect of having a long trial
rather than several short trials is that it provides a
measurement of the situational awareness of the searchers
over time. Figure 11 shows UAV locations and target
location probabilities halfway through our second day's
trial. The searchers appear to have locked onto the
locations of the four targets. Figure 12 shows the end of
the scenario, where the searchers have some idea of the
target locations, but not as good as in the previous
screenshot. This shows that searcher knowledge of target
location went in cycles: the searchers had the targets
pinned down, then the probability mass spread out, and
eventually the searchers would pin down the targets again.
This could also be explained by a high estimate of the
probability of detection, because it would eliminate too
much mass from a location that was just searched when there
should still be a significant probability mass at that
location. If this estimate were lowered, it would take
longer for the searchers to isolate the target location, but
the result would be more accurate and unlikely to go through
the cycle of target knowledge that was experienced in this
trial.
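The effect of an overestimated probability of detection can be seen in a small Bayesian non-detection update. The sketch below is a generic Bayes-rule illustration over three locations, not the ASOM update functions; the prior and the value 0.9 are hypothetical, while 0.46 is the empirical estimate reported above.

```python
def negative_update(prior, searched, p_detect):
    """Posterior over locations after an unsuccessful search of one location."""
    unnormalized = [p * (1.0 - p_detect) if k == searched else p
                    for k, p in enumerate(prior)]
    total = sum(unnormalized)  # normalizing factor
    return [p / total for p in unnormalized]


prior = [0.5, 0.3, 0.2]  # hypothetical target marginals over three locations
for p_detect in (0.9, 0.46):
    posterior = negative_update(prior, searched=0, p_detect=p_detect)
    print(p_detect, [round(p, 3) for p in posterior])
```

With the higher value, the searched location retains under a tenth of the probability mass after one failed search; with the lower value it retains about a third, so the searchers would be slower to abandon a location where the target may in fact still be.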

Figure  11.  Mid-Scenario  Probability  Map. 
Figure 12.  May Experiment Final Probability Map.


The second day's trial was markedly improved. The
small problems we experienced in day 1 were fixed for day 2
and the long trial ran smoothly. During the trial, the UAVs
operated without any mishaps. The disabled node was used
for refueling purposes and worked according to plan. The
results from day 2 were informative and the Excel interface
made ASOM easier to understand, even for the people
observing the experiment. After implementing the target
thresholds, the searchers were able to concentrate their
efforts on finding targets whose location probabilities were
spread out. The behavior of the searchers when they did not
concentrate on searching nodes with recently found targets
resulted in a noticeable improvement of situational
awareness when compared to the greedier ASOM. Even after
these updates, there are still a few recommendations for
future work on ASOM.
V.  FINAL  THOUGHTS 


A.  CONCLUSIONS 

We have created a decision aid that recommends
efficient search plans for multiple UAVs searching for
multiple moving targets, possibly of different types. This
decision aid demands few assumptions concerning the desired
search scenario. ASOM is general enough to support many
military or civilian search situations. It can be used to
search for terrorists moving between safe-houses or for
friendly pilots who have been shot down in a wooded area.
On the civilian side, it could be used for search and rescue
missions after natural disasters or to search for lost
hikers in the mountains. ASOM can also incorporate
stationary searchers or targets and can even keep track of
different types of targets. The decision aid can be
configured either for a greedy search that keeps track of
targets once they are found, or to go after other targets
that have not been found in a while, or at all.

Today,  UAVs  are  increasingly  used  in  combat  situations. 
Their  importance  in  future  warfare  will  continue  to  grow  and 
they  are  likely  to  become  more  important  in  many  different 
civilian  applications.  Creating  efficient  search  plans  for 
these  UAVs  is  the  problem  we  chose  to  solve,  but  there  are 
many  other  topics  involving  efficient  UAV  routing.  There  is 
a  necessity  for  work  such  as  that  seen  in  this  thesis  and 
the  importance  of  such  work  guarantees  many  different 
avenues  for  future  research  in  this  area. 
B.  FUTURE  WORK


Currently, there are several aspects of ASOM that could
be improved. First, we did not take the wind speed and
direction into account when determining flight times for
UAVs to reach destination nodes. This update would involve
creating a dynamic set of distance matrices that vary with
wind speed and direction. This would make the calculations
of arrival and search times far more accurate than the
constant distances that we used in the calculations.
Although accounting for wind is a relatively simple change
to the model, it should increase the accuracy substantially
relative to the amount of work required.

The second change would be to carry out more
calculations and experiments to get better estimates of the
probability of detection for different UAVs. The values we
used were estimated from past experience, but we believe
them to be too high. If more research were completed and
better estimates found, the accuracy of the model would
again be increased with a relatively small, albeit somewhat
time-consuming, amount of work.

The third change would require a bit more programming
experience, but in the end could create the most accurate
decision aid. This change would be to go beyond the
two-step look-ahead problem, not by looking further into
the future, but by creating an expected future reward based
on the state after the two-step look-ahead. This would be a
way of estimating any further look-ahead based on the state,
since there are diminishing returns on looking further into
the future and the computation time increases rapidly. This
expected reward on future searches based on the state is a
good way to avoid the problem of computational complexity,
yet get a more accurate solution.

A  fourth  possible  change  would  be  to  try  and 
incorporate  target  dependence  into  the  model.  Currently, 
the  model  assumes  independent  movement  of  the  targets.  This 
assumption  makes  computing  the  marginals  based  on  movement 
from  the  Markov  process  easier  than  if  the  targets' 
movements  were  dependent  on  one  another.  Getting  rid  of 
this  assumption  would  be  a  somewhat  difficult  task  as  that 
part  of  the  updating  phase  would  have  to  be  reconstructed, 
but  it  would  be  a  great  way  to  extend  our  work  on  ASOM. 

An  extension  to  include  different  scenarios  is  to 
examine  the  possibility  of  tracking  criminals  after  a 
robbery  along  city  streets.  In  this  scenario,  searchers 
would  first  concentrate  their  search  around  the  robbery 
location,  but  as  time  increases  the  graph  of  nodes  and  arcs 
would  be  forced  to  expand  to  represent  the  criminals  getting 
away.  There  could  even  be  an  "escape  node"  to  represent  the 
criminals  getting  out  of  the  area  or  exceeding  the  time  the 
police  are  willing  to  search. 
APPENDIX  I:  ADDITIONAL  EXPRESSIONS  FOR  FORMULATION 


Random  Variables  and  Sets 

f  Random  variable  with  a  uniform  (0,1) 

distribution . 

The following random data sets, $DET_{b,t}$, $C_{b,t}$, and
$COMBO_{d,c,b,t}$, are used in the calculation of target
detections. ASOM receives all of the detections as inputs
during time t. ASOM must then determine the probability of
each different possible scenario of detections occurring, as
explained in the model formulation. These calculations are
handled by the appropriate functions below; these are the
random sets required for those calculations.


$DET_{b,t}$

Set of the number of detections of type b at
time t, $b \in B$, $t \in T$,
$DET_{b,t} = \{1, 2, \ldots, \sum_{i,j} D_{i,j,b,t}\}$, $d \in DET_{b,t}$.


$C_{b,t}$

Set of the number of different permutations
of target detections of type b at time t,
$b \in B$, $t \in T$,
$C_{b,t} = \{1, 2, \ldots, |U|! / (|U| - |DET_{b,t}|)!\}$, $c \in C_{b,t}$.


$COMBO_{d,c,b,t}$

Matrix of the different permutations of
target detections of type b at time t.
Detection number d of permutation number c
during time t, $c \in C_{b,t}$, $d \in DET_{b,t}$, $b \in B$, $t \in T$.
Model  Formulation  Functions


$$R_t(S_t, X_t) = \sum_{i,j,m} \Big( AGG_{i,j,t}\, x_{i,j,m,t}\, z_{m,t}\, PD_{i,j,m} + AGG_{j,j,t}\, x_{i,j,m,t}\, PDET_{j,m}(\max(0, STEP - y_{m,t})) \Big)$$


Reward for all searchers traveling between
node i and j, $m \in M$, $i, j \in I$. The reward
function is an important part of the model
because it is what the model intends to
optimize by changing the possible decision
variables.

$R^{ND}_{m,t}(S^{ND}_{m,t}, V_{m,t})$, with summation over i,j

Reward for searcher m traveling between node
i and j, $m \in M$, $i, j \in I$. This is the function
used for the future reward where the state
will depend on the previous moves of just one
searcher.

$$PDET_{i,m}(\tau) = 1 - e^{-\tau\, SPEED_m\, SW_m / (0.0625\pi)}$$


Probability of detection at node i by
searcher m, dependent on the amount of time
searched, $\tau$, $i \in I$, $m \in M$. This is the
function used to determine the probability of
detection at a node, rather than on a road
(handled earlier in the data section).
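The node detection probability follows the random search formula, 1 - exp(-(speed x sweep width x time) / area). The sketch below evaluates it under stated assumptions: the denominator is read as an area of 0.0625 x pi (a circle of radius 0.25), and the units (hours, miles per hour, miles) and sample values are illustrative guesses rather than values taken from the thesis.

```python
import math


def pdet(tau_hours: float, speed_mph: float, sweep_width_miles: float,
         area_sq_miles: float = 0.0625 * math.pi) -> float:
    """Random search formula: probability of detection after searching tau."""
    if tau_hours < 0:
        raise ValueError("search time must be non-negative")
    coverage = tau_hours * speed_mph * sweep_width_miles / area_sq_miles
    return 1.0 - math.exp(-coverage)


# Hypothetical searcher: 30 mph, 0.05-mile sweep width, 5 minutes of search.
print(round(pdet(5 / 60, 30.0, 0.05), 3))
```

The formula gives zero probability for zero search time and approaches one only asymptotically, which reflects the diminishing return of continuing to search the same node.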
$TOTAL_u(S_t, X_t)$ =


i,j,m 


MARG  iM  I.x,  ,  ( 1  -  PDt  ... )  • 

MARC,  ,  „  (l  -  PDETj  m (max(0,  STEP 


1 


A 


max(MARG"U  t)  <  1 
otherwise 


Vw 


Sub-function of $NEGATIVE_{i,j,u,t}(S_t, X_t)$ and
$NEGATIVE^{ND}_{i,j,u,m,t}(S_t, V_{m,t})$. It represents the
normalizing factor, meaning it is the sum of
all the posterior probabilities after a
Bayesian update. If the variable array input
is for a single searcher m (as with the input
$V_{m,t}$), the summation over variable m is only
over the single input value m.


“rev-(|-n(|-pA„)) 

TOTALu{SnXt) 

NEGATIVE,.  ul(S„Xl)  =  < 

r  \ 

margu.U',\ 

1  -  n  ( 1  -  PDET um  (max(l 0,  STEP  -  ymJ ))) 

V  m  J 

TOTAL, XSnXt) 

Vi,j,u 


Function  to  update  probability  maps  for 
failed  detection  via  Bayesian  updating. 
Takes  the  posterior  probabilities  and 
normalizes  by  dividing  by  the  sum  of  all 
posterior  probabilities. 


MARGw[l-U(l-PDtJ^ 

NEGATIVE™  m(St,Vml)  =  < 

TOTAL„(St,Vmt) 

/  \ 

MARG,JUJ  (l-no- PDET.m  (max(0,  STEP  -  ymJ)))\ 

TOTALu{S,,Vmt) 

i  =  j 


ViJ,u 


Function  to  update  probability  maps  for 
failed  detection  via  Bayesian  updating  for 
look-ahead.  Heuristic  approach  only  takes 
into  account  the  move  of  single  searcher  m. 
$TEMP1_{i,j,u,t}(S_t) = MARG_{i,j,u,t}\, MATRIX_{i,j,u} \;\; \forall i,j,u$

Sub-function of $MARKOV_{i,j,u,t}(S_t)$ and
$MARKOV^{ND}_{i,j,u,m,t}(S^{ND}_{m,t})$. It represents the probability
that a target at node i will remain at node i
for the next timestep.


„  MARGiiut 

TEMPla .  .  (S. )  =  T - 

J”  femax(rra,.H,l) 


Vy,  u 


TEMPI  (S,)  = 


Sub-function  of  TEMPI .  .  u  \)  .  It  represents 

the  additional  probability  each  node  will 
accumulate  for  the  next  timestep  by  the  mass 
coming  in  from  all  adjacent  roads. 

[  TEMPla  ^t(St)  i  =  j 

1  0  i*j 


Sub-function  of  MARKOVj  .  u  ,(5,)  and 

MARKOV™umt{S ™)  .  It  extends  the  previous 

function,  TEMPla .  u  t(S ,)  ,  to  account  for  the 

fact  that  only  nodes,  not  arcs,  have  this 
property . 


$$TEMP3_{i,j,u,t}(S_t) = \frac{(1 - TURN)\, \max(TTS_{i,j,u} - 1,\, 0)\, MARG_{i,j,u,t}}{\max(TTS_{i,j,u},\, 0)} \;\; \forall i,j,u$$


Sub-function of $MARKOV_{i,j,u,t}(S_t)$ and
$MARKOV^{ND}_{i,j,u,m,t}(S^{ND}_{m,t})$. It represents the probability
of a target on arc (i,j) deciding to continue on
that arc, with probability $(1 - TURN)$.
$$TEMP4_{i,j,u,t}(S_t) = \frac{TURN\, \max(TTS_{j,i,u} - 1,\, 0)\, MARG_{j,i,u,t}}{\max(TTS_{j,i,u},\, 0)} \;\; \forall i,j,u$$


Sub-function of $MARKOV_{i,j,u,t}(S_t)$ and
$MARKOV^{ND}_{i,j,u,m,t}(S^{ND}_{m,t})$. It represents the probability
of a target on arc (i,j) deciding to turn
around, with probability TURN.


$$
\mathrm{MARKOV}_{i,j,u,t}(S_t) =
\begin{cases}
\mathrm{TEMP1}_{i,j,u,t}(S_t) + \mathrm{TEMP2}_{i,j,u,t}(S_t) & i = j \\
\mathrm{TEMP1}_{i,j,u,t}(S_t) + \mathrm{TEMP3}_{i,j,u,t}(S_t) + \mathrm{TEMP4}_{i,j,u,t}(S_t) & i \neq j
\end{cases}
\qquad \forall i,j,u
$$

Function to update probability maps for target movement based
on the Markov matrix. It incorporates all sub-functions to
account for all probability mass leaving and coming into
arc (i,j).

$$
\mathrm{MARKOV}^{m}_{i,j,u,t}(S^{m}_t) =
\begin{cases}
\mathrm{TEMP1}_{i,j,u,t}(S^{m}_t) + \mathrm{TEMP2}_{i,j,u,t}(S^{m}_t) & i = j \\
\mathrm{TEMP1}_{i,j,u,t}(S^{m}_t) + \mathrm{TEMP3}_{i,j,u,t}(S^{m}_t) + \mathrm{TEMP4}_{i,j,u,t}(S^{m}_t) & i \neq j
\end{cases}
$$

Function to update probability maps for target movement based
on the Markov matrix for the second step of the look-ahead. It
incorporates all sub-functions to account for all probability
mass leaving and coming into arc (i,j). The only difference
between this function and the normal MARKOV function is that
this one is performed for each searcher m, and the current
input of the state will only be updated for the moves of
searcher m.

$$
\mathrm{PR1}_{i,j,d,c,t}(S_t, D_{\cdot,\cdot,b,t}) =
\begin{cases}
\mathrm{MARG}_{i,j,\mathrm{COMBO}_{d,c,b,t},t} & \displaystyle\sum_{k_1=1}^{i}\sum_{k_2=1}^{j-1} D_{k_1,k_2,b,t} < d \le \sum_{k_1=1}^{i}\sum_{k_2=1}^{j} D_{k_1,k_2,b,t} \\
1 & \text{otherwise}
\end{cases}
\qquad \forall i,j,d,c
$$

Sub-function of $\mathrm{PR2}_{i,j,c,t}(S_t, D_{\cdot,\cdot,b,t})$.
It calculates the probability of seeing a particular target of
a particular combination c for detection d of that combination
for each arc (i,j).

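$$
\mathrm{PR2}_{i,j,c,t}(S_t, D_{\cdot,\cdot,b,t}) = \prod_{d} \mathrm{PR1}_{i,j,d,c,t}(S_t, D_{\cdot,\cdot,b,t}) \qquad \forall i,j,c
$$

Sub-function of $\mathrm{PR3}_{c,t}(S_t, D_{\cdot,\cdot,b,t})$. It
determines the total probability of seeing all detections of a
particular combination for each arc (i,j).

$$
\mathrm{PR3}_{c,t}(S_t, D_{\cdot,\cdot,b,t}) = \prod_{i,j} \mathrm{PR2}_{i,j,c,t}(S_t, D_{\cdot,\cdot,b,t}) \qquad \forall c
$$

Sub-function of $\mathrm{PR4}_{c,t}(S_t, D_{\cdot,\cdot,b,t})$. It
determines the total probability (no normalization) of seeing
each combination of target detections by multiplying over i
and j.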


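$$
\mathrm{PR4}_{c,t}(S_t, D_{\cdot,\cdot,b,t}) = \frac{\mathrm{PR3}_{c,t}(S_t, D_{\cdot,\cdot,b,t})}{\displaystyle\sum_{c_1 \in C_t} \mathrm{PR3}_{c_1,t}(S_t, D_{\cdot,\cdot,b,t})} \qquad \forall c
$$

Sub-function of $\mathrm{CHOICE1}_{d,c,t}(S_t, D_{\cdot,\cdot,b,t})$.
It normalizes the calculation of
$\mathrm{PR3}_{c,t}(S_t, D_{\cdot,\cdot,b,t})$.

$$
\mathrm{CHOICE1}_{d,c,t}(S_t, D_{\cdot,\cdot,b,t}) =
\begin{cases}
\mathrm{COMBO}_{d,c,b,t} & \displaystyle\sum_{c_1 < c} \mathrm{PR4}_{c_1,t}(S_t, D_{\cdot,\cdot,b,t}) < \zeta \le \sum_{c_1 \le c} \mathrm{PR4}_{c_1,t}(S_t, D_{\cdot,\cdot,b,t}) \\
0 & \text{otherwise}
\end{cases}
\qquad \forall d,c
$$

Sub-function of $\mathrm{CHOICE2}_{d,t}(S_t, D_{\cdot,\cdot,b,t})$.
$\zeta$ denotes a random number drawn from a uniform(0,1)
distribution. Based on this random draw and the probabilities
of each scenario occurring, the function determines the actual
scenario of target detections that occurred according to the
model by setting all other combination values to 0.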


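$$
\mathrm{CHOICE2}_{d,t}(S_t, D_{\cdot,\cdot,b,t}) = \sum_{c} \mathrm{CHOICE1}_{d,c,t}(S_t, D_{\cdot,\cdot,b,t}) \qquad \forall d
$$

Sub-function of $\mathrm{POS1}_{i,j,d,u,t}(S_t, D_{\cdot,\cdot,b,t})$.
It gets rid of all other combinations except the values of the
one that actually occurred.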


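$$
\mathrm{POS1}_{i,j,d,u,t}(S_t, D_{\cdot,\cdot,b,t}) =
\begin{cases}
1 & \left(\displaystyle\sum_{k_1=1}^{i}\sum_{k_2=1}^{j-1} D_{k_1,k_2,b,t} < d \le \sum_{k_1=1}^{i}\sum_{k_2=1}^{j} D_{k_1,k_2,b,t}\right) \ \text{and}\ \bigl(u = \mathrm{CHOICE2}_{d,t}(S_t, D_{\cdot,\cdot,b,t})\bigr) \\
0 & \text{otherwise}
\end{cases}
\qquad \forall i,j,d,u
$$

Sub-function of $\mathrm{POS2}_{i,j,u,t}(S_t, D_{\cdot,\cdot,b,t})$.
It stores the value 1 for all locations where a detection
occurred at arc (i,j) for target u and detection d, and zero
otherwise.

$$
\mathrm{POS2}_{i,j,u,t}(S_t, D_{\cdot,\cdot,b,t}) = \sum_{d} \mathrm{POS1}_{i,j,d,u,t}(S_t, D_{\cdot,\cdot,b,t}) \qquad \forall i,j,u
$$

Sub-function of
$\mathrm{POSTYPE1}_{i,j,u,b,t}(S_t, D_{\cdot,\cdot,\cdot,t})$. It sums
over the detection number variable so we have a 1 if a
detection occurred on arc (i,j) for target u.

$$
\mathrm{POSTYPE1}_{i,j,u,b,t}(S_t, D_{\cdot,\cdot,\cdot,t}) =
\begin{cases}
\mathrm{MARG}_{i,j,u,t} & \displaystyle\sum_{i_1,j_1} \mathrm{POS2}_{i_1,j_1,u,t}(S_t, D_{\cdot,\cdot,b,t}) = 0 \\
\mathrm{POS2}_{i,j,u,t}(S_t, D_{\cdot,\cdot,b,t}) & \text{otherwise}
\end{cases}
\qquad \forall i,j,u,b
$$

Sub-function of
$\mathrm{POSTYPE2}_{i,j,u,b,t}(S_t, D_{\cdot,\cdot,\cdot,t})$. It sets
the value equal to the marginal value if no detection occurred
and 1 if a detection did occur (to spike the probability) for
each target type b.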


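$$
\mathrm{POSTYPE2}_{i,j,u,b,t}(S_t, D_{\cdot,\cdot,\cdot,t}) =
\begin{cases}
\mathrm{POSTYPE1}_{i,j,u,b,t}(S_t, D_{\cdot,\cdot,\cdot,t}) & \mathrm{TYPE}_u = b \\
0 & \text{otherwise}
\end{cases}
\qquad \forall i,j,u,b
$$

Sub-function of
$\mathrm{POSITIVE}_{i,j,u,t}(S_t, D_{\cdot,\cdot,\cdot,t})$. It sets the
target marginals to the correct values only if the current
target being looked at, u, matches the current type, b.

$$
\mathrm{POSITIVE}_{i,j,u,t}(S_t, D_{\cdot,\cdot,\cdot,t}) = \sum_{b} \mathrm{POSTYPE2}_{i,j,u,b,t}(S_t, D_{\cdot,\cdot,\cdot,t}) \qquad \forall i,j,u
$$

Function to update probability maps for positive detection via
Bayesian updating. It sums over the probabilities for
different types of targets.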
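
The scenario-selection chain described above (score every
assignment of targets to detections, normalize, pick one
scenario with a uniform draw, then spike the detected targets)
can be illustrated with a minimal Python sketch for targets of
a single type. This is not the thesis's MATLAB code: the
function name positive_update, the dictionary layout of the
marginals, and the use of itertools.permutations in place of a
combination-then-permutation enumeration are all assumptions
made for illustration.

```python
import itertools
import random


def positive_update(marg, detections, rng=random.random):
    """Hypothetical single-type positive update.

    marg[u] is a dict {(i, j): probability} for target u;
    detections is a list of arcs (i, j) where a detection occurred.
    """
    targets = list(marg)
    # Every ordered assignment of distinct targets to the detections.
    scenarios = list(itertools.permutations(targets, len(detections)))
    # Unnormalized scenario weight: product of the assigned marginals.
    weights = []
    for scenario in scenarios:
        w = 1.0
        for u, arc in zip(scenario, detections):
            w *= marg[u].get(arc, 0.0)
        weights.append(w)
    total = sum(weights)
    if total == 0.0:
        return marg  # no scenario is consistent with the marginals
    # One uniform(0,1) draw against the cumulative normalized weights.
    draw, cum, chosen = rng(), 0.0, None
    for scenario, w in zip(scenarios, weights):
        if w == 0.0:
            continue
        chosen = scenario  # last feasible scenario doubles as rounding fallback
        cum += w / total
        if draw < cum:
            break
    # Spike each detected target: 1 at its detection arc, 0 elsewhere.
    updated = {u: dict(p) for u, p in marg.items()}
    for u, arc in zip(chosen, detections):
        updated[u] = {a: (1.0 if a == arc else 0.0) for a in marg[u]}
    return updated
```

Targets that are not part of the chosen scenario keep their
marginals unchanged, matching the description of POSTYPE1.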




APPENDIX  II:  MATLAB  FUNCTION  DESCRIPTIONS 


A. STEP.M FUNCTION

This function is the main workhorse that runs the algorithm.
It performs all calculations, either directly or by calling
other functions. It first updates the target marginals by
running the positive Bayesian updates (detections) for the
different target types (PositiveBayesianPermutations.m). The
function then updates the probability of detection at each
arc (i,j), including the nodes and connecting arcs for any
stationary searchers, using the location of each searcher.
After these steps, the function updates the target marginals
with negative Bayesian updates (NegativeBayesian.m), the
traditional application of Bayes' theorem. Next, the function
updates for target movement from the Markov process
(MarginalsMovement.m) to account for the fact that targets
could have moved during the current timestep. Finally, the
function determines which moves to recommend for the next
timestep given the current state and detection matrix
(MultiSearcherMove.m).

B. INITIALIZEMARGINALS.M FUNCTION

This function is needed only for the actual experiment. It
initializes the target marginals before an experiment begins.
It takes as input the number of targets involved in the
experiment and returns the resulting initial target
marginals. For our experiments, we assumed a target was ten
times more likely to start at a node than on a road, but this
value depends entirely on the conditions of the scenario.

The function calculates these initial conditions by assigning
an integer count to each arc (i,j) to represent how likely a
target is to start there: 10 for each node, 1 for each road,
and zero for every other (i,j). It then divides by the sum of
the entire matrix to convert these counts into probabilities.
Finally, it assigns these probabilities to all targets.
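
The count-then-normalize initialization described above can be
sketched in Python as follows. This is an illustration rather
than the thesis's MATLAB code; the function name, the
adjacency-matrix input, and the keyword arguments are
assumptions, with only the 10-to-1 weighting taken from the
text.

```python
def initialize_marginals(adjacency, num_targets, node_weight=10, road_weight=1):
    """Hypothetical prior builder.

    adjacency[i][j] is truthy when a road runs from node i to node j.
    Returns one n-by-n probability matrix per target.
    """
    n = len(adjacency)
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                counts[i][j] = node_weight   # target starts at node i
            elif adjacency[i][j]:
                counts[i][j] = road_weight   # target starts on road i -> j
    total = sum(map(sum, counts))
    prior = [[c / total for c in row] for row in counts]
    # Every target receives its own copy of the same prior.
    return [[row[:] for row in prior] for _ in range(num_targets)]
```

For two nodes joined by a road in both directions, the counts
are 10, 1, 1, and 10, so each node starts with probability
10/22 and each road direction with 1/22.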

C. AREASEARCH.M FUNCTION

This is a simple function that determines the probability of
detection at a node given the time spent searching at the
node as well as the speed and sweep width of the searcher. It
uses the random search formula, assuming a circular search
area of radius one-quarter mile around the node. We assumed a
random search in order to obtain a lower bound on the actual
probability of detection. This function is used in the
SearcherMove.m function to help determine how much
probability mass would be collected by a certain move.
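
As an illustration, the random search formula gives a
detection probability of 1 - exp(-W v t / A) for sweep width
W, searcher speed v, search time t, and search area A. The
Python sketch below applies it to a circle of radius
one-quarter mile. The function name, argument names, and
units (hours, miles per hour, miles) are assumptions and are
not taken from the thesis's MATLAB code.

```python
import math


def area_search_pd(search_time_hr, speed_mph, sweep_width_mi, radius_mi=0.25):
    """Random-search detection probability in a circular area (sketch)."""
    area = math.pi * radius_mi ** 2                      # square miles
    coverage = sweep_width_mi * speed_mph * search_time_hr / area
    return 1.0 - math.exp(-coverage)
```

When the swept area W v t equals the search area A, the
formula yields 1 - exp(-1), about 0.63, which shows why random
search is a conservative estimate compared with a perfectly
planned sweep.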

D. SEARCHERMOVE.M FUNCTION

This function takes in the state and characteristics of one
particular searcher as well as a list of nodes not available
to this searcher at this time. It returns the searcher's best
first and second moves (the second move is the move in the
next timestep, which will be reoptimized based on the actual
state during the next timestep), how much probability mass
these moves collect, and whether this sequence of moves takes
both timesteps, thus constraining the options for the next
timestep's move.

The function loops through all nodes and checks which ones
the searcher is able to transit to, or conduct a road search
to, during the next timestep. It accomplishes this using two
nested "for" loops. It then updates the target marginals with
a negative Bayesian update, thus assuming that no detections
were made in this timestep, in order to get a more accurate
estimate of the state for the next timestep (this assumption
is not made during the reoptimization of the future move; it
is made now only for a more accurate representation of the
future state). The function then uses two more nested "for"
loops inside the other two to evaluate every sequence of two
moves (still including the option of either transiting or
searching the road) and determines the reward of each
sequence. If the sequence currently being examined is better
than any previous sequence, the function stores these moves
as the current best. It repeats this process until all moves
have been checked.
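
The two-move enumeration can be illustrated with a heavily
simplified Python sketch. It assumes a single probability map
over nodes that sums to 1, one detection probability pd for
every move, no target movement between the two moves, and no
distinction between transiting and road searching; the reward
is taken to be the probability of detecting on either move.
All names are hypothetical, and none of this is the thesis's
MATLAB code.

```python
def best_two_moves(mass, neighbors, start, pd, unavailable=frozenset()):
    """Return (reward, first_move, second_move) for one searcher (sketch).

    mass[n]: probability mass at node n (assumed to sum to 1);
    neighbors[n]: nodes reachable from n in one timestep.
    """
    best = (float("-inf"), None, None)
    for first in neighbors[start]:
        if first in unavailable:
            continue
        gain1 = pd * mass[first]
        # Assume no detection on the first move: negative Bayesian update.
        after = dict(mass)
        after[first] *= (1.0 - pd)
        norm = sum(after.values())
        if norm > 0.0:
            after = {n: p / norm for n, p in after.items()}
        for second in neighbors[first]:
            # Detect on move one, or miss on move one and detect on move two.
            reward = gain1 + (1.0 - gain1) * pd * after[second]
            if reward > best[0]:
                best = (reward, first, second)
    return best
```

With masses 0.1, 0.7, and 0.2 on three mutually reachable
nodes and pd = 0.5, searching the 0.7 node twice collects
0.35 + 0.175 = 0.525, which beats any sequence that visits a
different node.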

E. MULTISEARCHERMOVE.M FUNCTION

This function takes in the number of searchers and their
characteristics as well as the state at the current time. It
returns the recommended move for the current timestep for
each searcher and whether or not that searcher will be
blocked (constrained to continue along that search) for the
next timestep.

The function accomplishes this by repeatedly calling the
SearcherMove.m function with different restrictions for each
unconstrained searcher (searchers can be constrained if their
previous move limits their next move, i.e., they are still en
route to their previous destination, or if they are currently
inactive, i.e., out of fuel or down). The function first
limits constrained searchers to their appropriate moves and
then updates the restricted movement list to incorporate
these moves. It obtains the optimal move for each searcher
by running the SearcherMove.m function and stores these
moves. If there are no conflicts, these are the optimal moves
for the searchers; if there are conflicts, the function
updates the list of unavailable moves for each searcher and
determines the best scenario possible for the conflicting
searchers. It repeats this process until there are no
conflicts among the searchers, and the result is the set of
recommended movements for the next timestep. This iterative
process of eliminating possible moves and recalculating
optimal moves for each searcher can save orders of magnitude
in runtime over total enumeration for all searchers combined,
which tries many moves that are nowhere near optimal. Even in
the TTLP, total enumeration for a real-time experiment can
take too long; thus, this iterative process is an extremely
important part of the ASOM algorithm.
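
The iterative conflict resolution can be sketched in Python
under simplifying assumptions: each searcher's candidate
moves are summarized by a precomputed reward per destination
node, and a conflict is settled by letting the searcher keep
the node whose retention gives the largest combined reward
once the rivals fall back to their next-best options. The
function names and the reward table are hypothetical; in the
thesis the per-searcher optimum comes from SearcherMove.m.

```python
def best_option(reward_row, banned):
    """Best still-allowed node for one searcher, or None if none remain."""
    options = {n: v for n, v in reward_row.items() if n not in banned}
    return max(options, key=options.get) if options else None


def assign_moves(rewards):
    """rewards[s][n]: reward searcher s expects from moving to node n (sketch)."""
    banned = {s: set() for s in rewards}
    while True:
        choice = {s: best_option(rewards[s], banned[s]) for s in rewards}
        claimed = {}
        for s, n in choice.items():
            if n is not None:
                claimed.setdefault(n, []).append(s)
        conflicts = [(n, ss) for n, ss in claimed.items() if len(ss) > 1]
        if not conflicts:
            return choice
        node, rivals = conflicts[0]

        def total_if_kept_by(keeper):
            # Combined reward if `keeper` holds the node and rivals re-plan.
            total = rewards[keeper][node]
            for s in rivals:
                if s != keeper:
                    alt = best_option(rewards[s], banned[s] | {node})
                    total += rewards[s][alt] if alt is not None else 0.0
            return total

        keeper = max(rivals, key=total_if_kept_by)
        for s in rivals:
            if s != keeper:
                banned[s].add(node)  # grows every pass, so the loop terminates
```

Each pass removes at least one searcher-node pair from
consideration, so the loop ends after finitely many passes
without enumerating every joint assignment.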

F. POSITIVEBAYESIANPERM.M FUNCTION

This function takes in the current target marginals and a
matrix of all the detections. It returns the resulting target
marginals after updating for the positive detections in the
current timestep. It is only appropriate to use this function
when all targets are of the same type; the more general
version of this function, and the one used in practice, is
PositiveBayesianPermutations.m.

This function works by creating a matrix of all different
(unordered) combinations of targets that could have been seen
during the timestep using the nchoosek.m MATLAB library
function. Next, for each combination (each row of the
previously created matrix), it creates all (ordered)
permutations of that combination using the perms.m MATLAB
library function. It combines all of these permutations into
one large matrix of all possible permutations for the target
detections of the current timestep. These permutations
represent all of the different possible scenarios of target
detections. The function then determines the probability of
each scenario by multiplying together the target marginals of
each detected target at the location where it was supposedly
detected, and then normalizes by dividing each probability by
the sum of all the probabilities. After determining and
normalizing the probabilities, the function decides which
scenario actually occurred (from the viewpoint of the
model/algorithm) based on a random number draw. Once the
scenario has been selected, the function sets the target
marginals of every detected target to one at the arc where it
was detected and zero everywhere else, thus spiking the
probability of those targets.

G. POSITIVEBAYESIANPERMUTATIONS.M FUNCTION

This function takes in the current target marginals and an
array containing the type of each target, as well as a list
of all detection locations and the type of detection made at
each location. It returns the resulting target marginals
after all positive Bayesian updates have been made.

The function works by creating new temporary target marginal
matrices with an extra index representing all possible types
of targets. This creates many blank levels (levels with no
nonzero entries) of the target marginals for each type, as
there are nonzero entries only where the type index of the
marginals matches the actual type of the target. In a similar
manner, the function also creates a temporary detection
matrix with an extra index to indicate detections of a
certain type of target. Next, the function calls the
PositiveBayesianPerm.m function for each type separately.
That is, where the PositiveBayesianPerm.m function expects
the target marginals and a matrix of detections as input, we
give it only one level of the temporary target marginals and
temporary detection matrix, holding the type index fixed at
its current value and looping through all possible types.
This updates the temporary target marginals for each type
separately, but since all values were zero except for targets
whose type matched the current type index, we simply sum over
the type index to obtain the final value of the actual target
marginals updated for positive detections.

H. NEGATIVEBAYESIAN.M FUNCTION

The NegativeBayesian.m function performs the Bayesian update
for nondetection. This is the traditional use of Bayesian
updating as described in the introduction. It takes all
values of the target marginals where there was no detection
and updates them for the failed detection. The function
returns the updated values of the target marginals.

The function accomplishes this by examining every value of
the target marginals. Where the value equals 1, meaning a
detection occurred there (producing a probability spike equal
to 1), negative Bayesian updating is not applied. Where the
value is less than 1, the function updates the probability to
its previous value multiplied by the probability of failed
detection (1 - probability of detection). After updating each
target marginal, the function normalizes each value by
dividing it by the sum of the new probabilities. The result
is the new target marginals updated for failed detections.
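
The nondetection update described above can be sketched in
Python for a single target. The function name and the
dictionary representation keyed by arc (i, j) are assumptions
for illustration and are not taken from the thesis's MATLAB
code.

```python
def negative_update(marg, pd):
    """Bayesian update for nondetection on one target's marginals (sketch).

    marg[arc]: prior probability the target is at arc (i, j);
    pd[arc]: probability of detection there this timestep (0 if unsearched).
    """
    posterior = {}
    for arc, p in marg.items():
        if p < 1.0:
            posterior[arc] = p * (1.0 - pd.get(arc, 0.0))
        else:
            posterior[arc] = p  # spiked by a positive detection; leave as is
    total = sum(posterior.values())
    if total == 0.0:
        return dict(marg)  # nothing left to normalize
    return {arc: p / total for arc, p in posterior.items()}
```

For a target equally likely to be at two locations, an
unsuccessful search of the first with detection probability
0.5 leaves weights 0.25 and 0.5, which normalize to 1/3 and
2/3.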

I. MARGINALSMOVEMENT.M FUNCTION

The MarginalsMovement.m function takes in the current target
marginals as well as the speed of each of the targets and
returns the updated values of the target marginals after
incorporating possible movement for the current timestep
based on the Markov movement matrix.

The function accomplishes this by looping through each target
and, within that loop, through each arc (i,j) for that
target. First, it updates every arc to its new value based on
movement out of it for the next timestep by multiplying by
the movement matrix directly. Next, it updates the values of
nodes that have some probability moving into them from
adjacent roads. After that, it multiplies the values on roads
by (1 - TURN) to reduce the values on arcs where the target
could possibly turn around. Finally, on every arc where it
lowered the probability to account for targets turning
around, it raises the probability on the reverse arc by the
corresponding amount.
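
One way to picture the movement update is the Python sketch
below, which moves one target's probability mass for one
timestep. The bookkeeping is an assumption chosen so that
probability is conserved: mass at a node is spread along its
Markov row, a fraction 1/TTS of the mass on an arc arrives at
the destination node, and the remainder stays on the arc with
a TURN share shifted to the reverse arc. The names are
hypothetical, and the thesis's MATLAB code may order or index
these steps differently.

```python
def move_marginals(marg, matrix, tts, turn):
    """One-timestep movement update for a single target (sketch).

    marg[i][j]: mass at node i (i == j) or on arc i -> j;
    matrix[i][j]: Markov probability that a target at node i heads for j
                  (each row sums to 1, matrix[i][i] = stay put);
    tts[i][j]: timesteps needed to traverse arc i -> j;
    turn: probability that a target on an arc turns around.
    """
    n = len(marg)
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Mass at a node either stays or starts down an outgoing arc.
                for k in range(n):
                    new[i][k] += marg[i][i] * matrix[i][k]
            elif marg[i][j] > 0.0:
                steps = max(tts[i][j], 1)
                arriving = marg[i][j] / steps       # reaches node j
                staying = marg[i][j] - arriving     # still on the road
                new[j][j] += arriving
                new[i][j] += (1.0 - turn) * staying
                new[j][i] += turn * staying         # turned around
    return new
```

Because every unit of mass is assigned to exactly one
destination, the entries of the returned matrix sum to the
same total as the input.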

J. MOVEMENT.M FUNCTION

Movement.m is one of two functions that help model target
movement for experimentation. It is not used in the step
function, nor during the actual experiment, but rather aids
in generating random routes for targets to travel during
experimentation. It is called by the TargetMovement.m
function to return the next move of a target that needs a new
destination. It takes in the old position of the target and
the Markov movement matrix. It returns the new destination
node of that target.

This function works by looking at the row of the Markov
movement matrix that corresponds to the starting position of
the target (which sums to 1, by definition) and making a
random draw from a uniform(0,1) distribution. With this
random number, the function returns the column of the entry
whose cumulative probability matches the random number drawn.
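
The cumulative-probability lookup can be sketched in Python
as follows. The function name and the optional draw argument
(included so the example can be checked deterministically)
are assumptions and are not part of the thesis's MATLAB code.

```python
import random


def next_destination(position, matrix, draw=None):
    """Sample the next node from row `position` of a Markov matrix (sketch)."""
    if draw is None:
        draw = random.random()          # uniform(0,1) draw
    row = matrix[position]
    cumulative = 0.0
    for node, p in enumerate(row):
        cumulative += p
        if draw < cumulative:
            return node
    # Rounding fallback: last column with positive probability.
    return max(k for k, p in enumerate(row) if p > 0.0)
```

For a row (0.2, 0.5, 0.3), draws below 0.2 return column 0,
draws from 0.2 up to 0.7 return column 1, and larger draws
return column 2, so each column is selected with its row
probability.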

K. TARGETMOVEMENT.M FUNCTION

TargetMovement.m is the second of two functions made to model
target movement for experimentation. It takes in the amount
of time the targets will move around, the number of targets,
a speed array containing the speed of each target, and the
starting positions of the targets. It returns the final
positions of the targets after they have moved for the amount
of time input. The output matrix has one row for each target
and three columns, with the first two representing the start
and finish nodes of the arc the target is currently on (if
the start and finish nodes are equal, the target is
stationary at that node), and the third being how many
timesteps the target has remaining on that arc before
completing it. To see every movement in the sequence, the
user can repeatedly run the function with the end time equal
to one timestep, updating the start positions with the output
positions from the previous step.

This function works by running a "while" loop until the
simulation time reaches the end time input. Within each
iteration, it loops through the targets to update their
positions one at a time. If the current target is stationary
at a node, it calls the Movement.m function to get a new
destination node (which could be to remain at the same node
for another timestep); otherwise, the target remains on the
road on which it was previously located. It then makes a draw
from a uniform(0,1) distribution. If this random draw is less
than the turn probability, the function reverses the arc and
the number of moves remaining to complete that arc;
otherwise, the function only updates the number of moves
remaining until completion of the current arc. Finally, the
function stores all of the new information in the output
matrix and increments time for the next timestep.